Towards Efficient Business Process Clustering and Retrieval: Combining Language Modeling and Structure Matching
Abstract
Large organizations tend to have hundreds of business processes. Discovering and understanding similarities among business processes can be useful to organizations for a number of reasons, including better overall process management and maintenance. In this paper we present a novel and efficient approach to cluster and retrieve business processes. A given set of business processes is clustered based on their underlying topic, structure and semantic similarities. In addition, given a query business process, the top-k most similar processes are retrieved based on the clustering results. In this work, we bring together two schools of work that have not been well connected, statistical language modeling and structure matching, and combine them in a novel way. Our approach takes into account both high-level topic information that can be collected from process description documents and keywords, and detailed structural features such as process control flows, in finding similarities among business processes. This ability to work with processes that may not always have formal control flows is particularly useful in dealing with real-world business processes, which are not always described formally. We developed a system to implement our approach and evaluated it on several collections of industry best practice processes and real-world business processes at a large IT service company that are described at varied levels of formality. Our experimental results reveal that the combined language modeling and structure matching based retrieval outperforms structure-matching-only techniques in both mean average precision and running time measures.
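The idea of combining text-based language modeling with structure matching for top-k retrieval can be illustrated with a minimal sketch. This is not the authors' algorithm: the clustering step is omitted, and the scoring choices (a Jelinek-Mercer smoothed unigram query likelihood for the text stage, Jaccard overlap of control-flow edges for the structure stage, and a linear mix weighted by `alpha`) are assumptions made for illustration. All function names, parameters, and the sample processes are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (an assumption; real systems do more)."""
    return text.lower().split()


def query_log_likelihood(query_tokens, doc_tokens, coll_counts, coll_len, lam=0.5):
    """Unigram query log-likelihood with Jelinek-Mercer smoothing."""
    doc_counts = Counter(doc_tokens)
    doc_len = max(len(doc_tokens), 1)
    score = 0.0
    for tok in query_tokens:
        p = lam * doc_counts[tok] / doc_len + (1 - lam) * coll_counts[tok] / coll_len
        if p > 0:  # tokens unseen in the whole collection are skipped
            score += math.log(p)
    return score


def edge_jaccard(edges_a, edges_b):
    """Jaccard overlap of two sets of control-flow edges."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 0.0


def retrieve_top_k(query, processes, k=2, shortlist=3, alpha=0.5):
    """Stage 1: shortlist by text score. Stage 2: re-rank with structure overlap."""
    docs = {name: tokenize(p["text"]) for name, p in processes.items()}
    coll_counts = Counter(tok for toks in docs.values() for tok in toks)
    coll_len = max(sum(coll_counts.values()), 1)
    q_tokens = tokenize(query["text"])
    q_edges = query.get("edges") or set()

    text_scores = {
        name: query_log_likelihood(q_tokens, toks, coll_counts, coll_len)
        for name, toks in docs.items()
    }
    candidates = sorted(text_scores, key=text_scores.get, reverse=True)[:shortlist]
    if not candidates:
        return []

    # Min-max normalize text scores within the shortlist so they mix with Jaccard.
    lo = min(text_scores[c] for c in candidates)
    hi = max(text_scores[c] for c in candidates)
    span = hi - lo

    ranked = []
    for name in candidates:
        text_norm = (text_scores[name] - lo) / span if span > 0 else 1.0
        edges = processes[name].get("edges") or set()
        if edges and q_edges:
            score = alpha * text_norm + (1 - alpha) * edge_jaccard(q_edges, edges)
        else:
            # No formal control flow on one side: fall back to the text score alone.
            score = text_norm
        ranked.append((name, score))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical sample data, for illustration only.
PROCESSES = {
    "order_to_cash": {
        "text": "receive customer order check credit ship goods send invoice",
        "edges": {("receive order", "check credit"), ("check credit", "ship goods"),
                  ("ship goods", "send invoice")},
    },
    "procure_to_pay": {
        "text": "create purchase order receive goods approve invoice pay supplier",
        "edges": {("create po", "receive goods"), ("receive goods", "approve invoice"),
                  ("approve invoice", "pay supplier")},
    },
    "incident_mgmt": {
        "text": "log incident classify incident resolve incident close ticket",
        "edges": set(),  # described informally, no control flow available
    },
    "order_fulfilment": {
        "text": "receive customer order pick items ship goods",
        "edges": {("receive order", "pick items"), ("pick items", "ship goods")},
    },
}
QUERY = {
    "text": "customer order ship goods invoice",
    "edges": {("receive order", "check credit"), ("check credit", "ship goods")},
}

if __name__ == "__main__":
    for name, score in retrieve_top_k(QUERY, PROCESSES, k=2):
        print(name, round(score, 3))
```

Running structure matching only on the text-based shortlist, rather than on every process, is one plausible reason a combined approach could reduce running time. Processes without control flows still receive a score from their descriptions, which mirrors the abstract's point about informally described processes.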
Similar resources
Information Integration with Warp 10
Warp 10 is a novel approach to B2B-integration combining concepts from semantic data modeling, Natural Language Processing and Schema Matching to define a consolidated data model termed canonical model. The canonical model is built up semi-automatically and grows in size and complexity upon matching with schemas provided by a business community. The business context of a matching task allows fo...
Efficient Semantics-Based Compliance Checking Using LTL Formulae and Unfolding
Business process models are required to be in line with frequently changing regulations, policies, and environments. In the field of intelligent modeling, organisations are interested in automated business process compliance checking, as manual verification is time-consuming and inefficient. There exist two key issues for business process compliance checking. One is the definition of a business ...
Improved Skips for Faster Postings List Intersection
Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function and query analyzer are the main components of an information retrieval system. Document retrieval systems are fundamentally based on Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...
Control Flow Information Analysis in Process Model Matching Techniques
Process model matchers automate the identification of correspondences between process models, i.e., activities that represent similar functionality in different models. This way they support a variety of tasks in business process management, e.g., the management of process model collections, the consolidation of processes, or the re-use of process fragments at design time. To detect corresponde...
Publication date: 2011